452 research outputs found

    The CMA Evolution Strategy: A Tutorial

    Full text link
    This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized, method for real-parameter (continuous domain) optimization of non-linear, non-convex functions. We try to motivate and derive the algorithm from intuitive concepts and from requirements of non-linear, non-convex search in continuous domain.Comment: ArXiv e-prints, arXiv:1604.xxxx

    CMA-ES with Two-Point Step-Size Adaptation

    Get PDF
    We combine a refined version of two-point step-size adaptation with the covariance matrix adaptation evolution strategy (CMA-ES). Additionally, we suggest polished formulae for the learning rate of the covariance matrix and the recombination weights. In contrast to cumulative step-size adaptation or to the 1/5-th success rule, the refined two-point adaptation (TPA) does not rely on any internal model of optimality. In contrast to conventional self-adaptation, the TPA will achieve a better target step-size in particular with large populations. The disadvantage of TPA is that it relies on two additional objective functio

    Injecting External Solutions Into CMA-ES

    Get PDF
    This report considers how to inject external candidate solutions into the CMA-ES algorithm. The injected solutions might stem from a gradient or a Newton step, a surrogate model optimizer or any other oracle or search mechanism. They can also be the result of a repair mechanism, for example to render infeasible solutions feasible. Only small modifications to the CMA-ES are necessary to turn injection into a reliable and effective method: too long steps need to be tightly renormalized. The main objective of this report is to reveal this simple mechanism. Depending on the source of the injected solutions, interesting variants of CMA-ES arise. When the best-ever solution is always (re-)injected, an elitist variant of CMA-ES with weighted multi-recombination arises. When \emph{all} solutions are injected from an \emph{external} source, the resulting algorithm might be viewed as \emph{adaptive encoding} with step-size control. In first experiments, injected solutions of very good quality lead to a convergence speed twice as fast as on the (simple) sphere function without injection. This means that we observe an impressive speed-up on otherwise difficult to solve functions. Single bad injected solutions on the other hand do no significant harm.Comment: No. RR-7748 (2011

    Linear Convergence of Comparison-based Step-size Adaptive Randomized Search via Stability of Markov Chains

    Get PDF
    In this paper, we consider comparison-based adaptive stochastic algorithms for solving numerical optimisation problems. We consider a specific subclass of algorithms that we call comparison-based step-size adaptive randomized search (CB-SARS), where the state variables at a given iteration are a vector of the search space and a positive parameter, the step-size, typically controlling the overall standard deviation of the underlying search distribution.We investigate the linear convergence of CB-SARS on\emph{scaling-invariant} objective functions. Scaling-invariantfunctions preserve the ordering of points with respect to their functionvalue when the points are scaled with the same positive parameter (thescaling is done w.r.t. a fixed reference point). This class offunctions includes norms composed with strictly increasing functions aswell as many non quasi-convex and non-continuousfunctions. On scaling-invariant functions, we show the existence of ahomogeneous Markov chain, as a consequence of natural invarianceproperties of CB-SARS (essentially scale-invariance and invariance tostrictly increasing transformation of the objective function). We thenderive sufficient conditions for \emph{global linear convergence} ofCB-SARS, expressed in terms of different stability conditions of thenormalised homogeneous Markov chain (irreducibility, positivity, Harrisrecurrence, geometric ergodicity) and thus define a general methodologyfor proving global linear convergence of CB-SARS algorithms onscaling-invariant functions. As a by-product we provide aconnexion between comparison-based adaptive stochasticalgorithms and Markov chain Monte Carlo algorithms.Comment: SIAM Journal on Optimization, Society for Industrial and Applied Mathematics, 201

    Linear Convergence on Positively Homogeneous Functions of a Comparison Based Step-Size Adaptive Randomized Search: the (1+1) ES with Generalized One-fifth Success Rule

    Get PDF
    In the context of unconstraint numerical optimization, this paper investigates the global linear convergence of a simple probabilistic derivative-free optimization algorithm (DFO). The algorithm samples a candidate solution from a standard multivariate normal distribution scaled by a step-size and centered in the current solution. This solution is accepted if it has a better objective function value than the current one. Crucial to the algorithm is the adaptation of the step-size that is done in order to maintain a certain probability of success. The algorithm, already proposed in the 60's, is a generalization of the well-known Rechenberg's (1+1)(1+1) Evolution Strategy (ES) with one-fifth success rule which was also proposed by Devroye under the name compound random search or by Schumer and Steiglitz under the name step-size adaptive random search. In addition to be derivative-free, the algorithm is function-value-free: it exploits the objective function only through comparisons. It belongs to the class of comparison-based step-size adaptive randomized search (CB-SARS). For the convergence analysis, we follow the methodology developed in a companion paper for investigating linear convergence of CB-SARS: by exploiting invariance properties of the algorithm, we turn the study of global linear convergence on scaling-invariant functions into the study of the stability of an underlying normalized Markov chain (MC). We hence prove global linear convergence by studying the stability (irreducibility, recurrence, positivity, geometric ergodicity) of the normalized MC associated to the (1+1)(1+1)-ES. More precisely, we prove that starting from any initial solution and any step-size, linear convergence with probability one and in expectation occurs. Our proof holds on unimodal functions that are the composite of strictly increasing functions by positively homogeneous functions with degree α\alpha (assumed also to be continuously differentiable). This function class includes composite of norm functions but also non-quasi convex functions. Because of the composition by a strictly increasing function, it includes non continuous functions. We find that a sufficient condition for global linear convergence is the step-size increase on linear functions, a condition typically satisfied for standard parameter choices. While introduced more than 40 years ago, we provide here the first proof of global linear convergence for the (1+1)(1+1)-ES with generalized one-fifth success rule and the first proof of linear convergence for a CB-SARS on such a class of functions that includes non-quasi convex and non-continuous functions. Our proof also holds on functions where linear convergence of some CB-SARS was previously proven, namely convex-quadratic functions (including the well-know sphere function)

    Markov Chain Analysis of Evolution Strategies on a Linear Constraint Optimization Problem

    Get PDF
    This paper analyses a (1,λ)(1,\lambda)-Evolution Strategy, a randomised comparison-based adaptive search algorithm, on a simple constraint optimisation problem. The algorithm uses resampling to handle the constraint and optimizes a linear function with a linear constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using path length control. We exhibit for each case a Markov chain whose stability analysis would allow us to deduce the divergence of the algorithm depending on its internal parameters. We show divergence at a constant rate when the step-size is constant. We sketch that with step-size adaptation geometric divergence takes place. Our results complement previous studies where stability was assumed.Comment: Amir Hussain; Zhigang Zeng; Nian Zhang. IEEE Congress on Evolutionary Computation, Jul 2014, Beijing, Chin

    Markov Chain Analysis of Cumulative Step-size Adaptation on a Linear Constrained Problem

    Get PDF
    This paper analyzes a (1, λ\lambda)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence , depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed.Comment: Evolutionary Computation, Massachusetts Institute of Technology Press (MIT Press): STM Titles, 201

    Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles

    Get PDF
    We present a canonical way to turn any smooth parametric family of probability distributions on an arbitrary search space XX into a continuous-time black-box optimization method on XX, the \emph{information-geometric optimization} (IGO) method. Invariance as a design principle minimizes the number of arbitrary choices. The resulting \emph{IGO flow} conducts the natural gradient ascent of an adaptive, time-dependent, quantile-based transformation of the objective function. It makes no assumptions on the objective function to be optimized. The IGO method produces explicit IGO algorithms through time discretization. It naturally recovers versions of known algorithms and offers a systematic way to derive new ones. The cross-entropy method is recovered in a particular case, and can be extended into a smoothed, parametrization-independent maximum likelihood update (IGO-ML). For Gaussian distributions on Rd\mathbb{R}^d, IGO is related to natural evolution strategies (NES) and recovers a version of the CMA-ES algorithm. For Bernoulli distributions on {0,1}d\{0,1\}^d, we recover the PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm for optimization on {0,1}d\{0,1\}^d. All these algorithms are unified under a single information-geometric optimization framework. Thanks to its intrinsic formulation, the IGO method achieves invariance under reparametrization of the search space XX, under a change of parameters of the probability distributions, and under increasing transformations of the objective function. Theory strongly suggests that IGO algorithms have minimal loss in diversity during optimization, provided the initial diversity is high. First experiments using restricted Boltzmann machines confirm this insight. Thus IGO seems to provide, from information theory, an elegant way to spontaneously explore several valleys of a fitness landscape in a single run.Comment: Final published versio

    Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization Heuristics: Profiling (1+λ)(1+\lambda) EA Variants on OneMax and LeadingOnes

    Full text link
    Theoretical and empirical research on evolutionary computation methods complement each other by providing two fundamentally different approaches towards a better understanding of black-box optimization heuristics. In discrete optimization, both streams developed rather independently of each other, but we observe today an increasing interest in reconciling these two sub-branches. In continuous optimization, the COCO (COmparing Continuous Optimisers) benchmarking suite has established itself as an important platform that theoreticians and practitioners use to exchange research ideas and questions. No widely accepted equivalent exists in the research domain of discrete black-box optimization. Marking an important step towards filling this gap, we adjust the COCO software to pseudo-Boolean optimization problems, and obtain from this a benchmarking environment that allows a fine-grained empirical analysis of discrete black-box heuristics. In this documentation we demonstrate how this test bed can be used to profile the performance of evolutionary algorithms. More concretely, we study the optimization behavior of several (1+λ)(1+\lambda) EA variants on the two benchmark problems OneMax and LeadingOnes. This comparison motivates a refined analysis for the optimization time of the (1+λ)(1+\lambda) EA on LeadingOnes
    corecore